1 S. cerevisiae sample estimation of all samples (old and new)

This document should make clear the suitability of our yeast data for differential expression analyses. It should also give some ideas about the depth and distribution of the data.

2 Gathering samples

Later, we will likely choose to exclude one of the three experimental batches in the data. Therefore, I am making three expressionsets, one with all data, and one the various subsets.

merged_expt <- sm(create_expt(metadata="sample_sheets/all_samples.xlsx",
                              gene_info=sc_all_annotations,
                              file_column="allfile"))
## Don't forget, I need to change the condition names.
only_old <- subset_expt(merged_expt, subset="batch=='E1'")
## There were 24, now there are 8 samples.
merged_new <- subset_expt(merged_expt, subset="batch!='E1'")
## There were 24, now there are 16 samples.
merged_nor <- subset_expt(merged_expt, subset="batch!='E2B1'")
## There were 24, now there are 16 samples.
merged_nos <- subset_expt(merged_expt, subset="batch!='E2B2'")
## There were 24, now there are 16 samples.

3 Visualizing raw data

There are lots of methods we have to examine raw data and explore stuff like batch effects or non-canonical distributions or skewed counts. hpgltools provides some functionality to make this process easier. The graphs shown below and many more are generated with the wrapper ‘graph_metrics()’, when invoked it provides a list of plots including: library sizes, the number of non-zero genes, density/box plots by sample, distance/correlation heatmaps, standard median correlation/distance, PCA/TSNE clustering, top-n genes, a legend describing the colors/symbols used, and some tables describing the data. Optionally, it can also perform quantile-quantile plots showing the distribution of each sample vs. the median of all samples and MA plots of the same.

Caveat: some plots do not work well with gene IDs that are all-0, thus I first filter the data to remove them.

I also added a neat function ‘plot_libsize_prepost()’ which shows how many genes are poorly represented before/after filtering the data.

The plots printed here are all metrics which are useful when considering raw data.

merged_filt <- sm(normalize_expt(merged_expt, filter=TRUE))
merged_metrics <- sm(graph_metrics(merged_expt))

pp(file="illustrator_input/01_legend.pdf", image=merged_metrics$legend)
## Writing the image to: illustrator_input/01_legend.pdf and calling dev.off().
pp(file="illustrator_input/02_raw_libsize.pdf", image=merged_metrics$libsize)
## Writing the image to: illustrator_input/02_raw_libsize.pdf and calling dev.off().

prepost <- plot_libsize_prepost(merged_expt)
pp(file="illustrator_input/03_libsize_changed_lowgenes.pdf", image=prepost$lowgene_plot)
## Writing the image to: illustrator_input/03_libsize_changed_lowgenes.pdf and calling dev.off().

pp(file="illustrator_input/04_nonzero_genes.pdf", image=merged_metrics$nonzero)
## Writing the image to: illustrator_input/04_nonzero_genes.pdf and calling dev.off().

pp(file="illustrator_input/05_raw_boxplot.pdf", image=merged_metrics$boxplot)
## Writing the image to: illustrator_input/05_raw_boxplot.pdf and calling dev.off().

pp(file="illustrator_input/06_raw_density.pdf", image=merged_metrics$density)
## Writing the image to: illustrator_input/06_raw_density.pdf and calling dev.off().

pp(file="illustrator_input/07_raw_boxplot.pdf", image=merged_metrics$boxplot)
## Writing the image to: illustrator_input/07_raw_boxplot.pdf and calling dev.off().

pp(file="illustrator_input/08_topn.pdf", image=merged_metrics$topnplot)
## Writing the image to: illustrator_input/08_topn.pdf and calling dev.off().

4 Normalize and visualize

Other metrics are more useful when used with data on the log scale and normalized by number of reads/library and/or by quantile.

merged_norm <- normalize_expt(merged_expt, transform="log2", convert="cpm", filter=TRUE)
## This function will replace the expt$expressionset slot with:
## log2(cpm(hpgl(data)))
## It backs up the current data into a slot named:
##  expt$backup_expressionset. It will also save copies of each step along the way
##  in expt$normalized with the corresponding libsizes. Keep the libsizes in mind
##  when invoking limma.  The appropriate libsize is the non-log(cpm(normalized)).
##  This is most likely kept at:
##  'new_expt$normalized$intermediate_counts$normalization$libsizes'
##  A copy of this may also be found at:
##  new_expt$best_libsize
## Leaving the data unnormalized.  This is necessary for DESeq, but
##  EdgeR/limma might benefit from normalization.  Good choices include quantile,
##  size-factor, tmm, etc.
## Not correcting the count-data for batch effects.  If batch is
##  included in EdgerR/limma's model, then this is probably wise; but in extreme
##  batch effects this is a good parameter to play with.
## Step 1: performing count filter with option: hpgl
## Removing 1030 low-count genes (6095 remaining).
## Step 2: not normalizing the data.
## Step 3: converting the data with cpm.
## Step 4: transforming the data with log2.
## transform_counts: Found 1152 values equal to 0, adding 1 to the matrix.
## Step 5: not doing batch correction.
merged_metrics <- graph_metrics(merged_norm)
## Graphing number of non-zero genes with respect to CPM by library.
## Graphing library sizes.
## Graphing a boxplot.
## Graphing a correlation heatmap.
## Graphing a standard median correlation.
## Performing correlation.
## Graphing a distance heatmap.
## Graphing a standard median distance.
## Performing distance.
## Graphing a PCA plot.
## Graphing a T-SNE plot.
## Plotting a density plot.
## Plotting the representation of the top-n genes.
## Printing a color to condition legend.

The data should now be normalized, lets view some metrics post-facto.

pp(file="illustrator_input/09_norm_corheat.pdf", image=merged_metrics$corheat)
## Writing the image to: illustrator_input/09_norm_corheat.pdf and calling dev.off().

pp(file="illustrator_input/10_norm_disheat.pdf", image=merged_metrics$disheat)
## Writing the image to: illustrator_input/10_norm_disheat.pdf and calling dev.off().

## It appears that just the normalization is sufficient to split the samples completely by type and deeply separate them from the heterologous samples

pp(file="illustrator_input/11_norm_smc.pdf", image=merged_metrics$smc)
## Writing the image to: illustrator_input/11_norm_smc.pdf and calling dev.off().

pp(file="illustrator_input/12_norm_smd.pdf", image=merged_metrics$smd)
## Writing the image to: illustrator_input/12_norm_smd.pdf and calling dev.off().

## The samples are very well behaved, none fall below the red line.

pp(file="illustrator_input/13_norm_pca.pdf", image=merged_metrics$pcaplot)
## Writing the image to: illustrator_input/13_norm_pca.pdf and calling dev.off().

pp(file="illustrator_input/14_norm_tsne.pdf", image=merged_metrics$tsneplot)
## Writing the image to: illustrator_input/14_norm_tsne.pdf and calling dev.off().

## The homogeneous wt/mutant are nicely separated, and what is more, the exogeneous samples also split wt/mutant, that might prove to be quite useful.

5 Now look at the data without batch ‘E2B1’

nor_norm <- sm(normalize_expt(merged_nor, transform="log2", convert="cpm", norm="quant", filter=TRUE))
nor_plots <- sm(graph_metrics(nor_norm))
nor_batchnorm <- sm(normalize_expt(merged_nor, transform="log2", convert="cpm", norm="quant", filter=TRUE,
                                batch="fsva"))
nor_batchplots <- sm(graph_metrics(nor_batchnorm))

6 Try many batch estimation methods to see what is up.

I do not think any of this information is currently being used, so I am going to stop running it for the moment but keep it here in case we decide to revive it.

merged_pcabatch <- sm(normalize_expt(merged_expt, transform="log2",
                                filter=TRUE, batch="pca", low_to_zero=TRUE))
merged_pcabatch_metrics <- sm(graph_metrics(merged_pcabatch))

merged_nor_pcabatch <- sm(normalize_expt(merged_nor, transform="log2",
                                filter=TRUE, batch="pca", low_to_zero=TRUE))
merged_nor_pcabatch_metrics <- sm(graph_metrics(merged_nor_pcabatch))

merged_nos_pcabatch <- sm(normalize_expt(merged_nos, transform="log2",
                                filter=TRUE, batch="pca", low_to_zero=TRUE))
merged_nos_pcabatch_metrics <- sm(graph_metrics(merged_nos_pcabatch))

merged_sva <- sm(normalize_expt(merged_expt, transform="log2",
                                filter=TRUE, batch="sva", low_to_zero=TRUE))
merged_sva_metrics <- sm(graph_metrics(merged_sva))

merged_nor_sva <- sm(normalize_expt(merged_nor, transform="log2",
                                filter=TRUE, batch="sva", low_to_zero=TRUE))
merged_nor_sva_metrics <- sm(graph_metrics(merged_nor_sva))

merged_nos_sva <- sm(normalize_expt(merged_nos, transform="log2",
                                filter=TRUE, batch="sva", low_to_zero=TRUE))
merged_nos_sva_metrics <- sm(graph_metrics(merged_nos_sva))

merged_combat <- sm(normalize_expt(merged_expt, transform="log2",
                                   filter=TRUE, batch="combat_scale", low_to_zero=TRUE))
merged_combat_metrics <- sm(graph_metrics(merged_combat))

merged_nor_combat <- sm(normalize_expt(merged_nor, transform="log2",
                                   filter=TRUE, batch="combat_scale", low_to_zero=TRUE))
merged_nor_combat_metrics <- sm(graph_metrics(merged_nor_combat))

merged_nos_combat <- sm(normalize_expt(merged_nos, transform="log2",
                                   filter=TRUE, batch="combat_scale", low_to_zero=TRUE))
merged_nos_combat_metrics <- sm(graph_metrics(merged_nos_combat))

merged_limma <- sm(normalize_expt(merged_expt, transform="log2",
                                   filter=TRUE, batch="limmaresid", low_to_zero=TRUE))
merged_limma_metrics <- sm(graph_metrics(merged_limma))

merged_nor_limma <- sm(normalize_expt(merged_nor, transform="log2",
                                   filter=TRUE, batch="limmaresid", low_to_zero=TRUE))
merged_nor_limma_metrics <- sm(graph_metrics(merged_nor_limma))

merged_nos_limma <- sm(normalize_expt(merged_nos, transform="log2",
                                   filter=TRUE, batch="limmaresid", low_to_zero=TRUE))
merged_nos_limma_metrics <- sm(graph_metrics(merged_nos_limma))

merged_fsva <- sm(normalize_expt(merged_expt, transform="log2",
                              filter=TRUE, batch="fsva", low_to_zero=TRUE))
merged_fsva_metrics <- sm(graph_metrics(merged_fsva))

merged_nor_fsva <- sm(normalize_expt(merged_nor, transform="log2",
                              filter=TRUE, batch="fsva", low_to_zero=TRUE))
merged_nor_fsva_metrics <- sm(graph_metrics(merged_nor_fsva))

merged_nos_fsva <- sm(normalize_expt(merged_nos, transform="log2",
                              filter=TRUE, batch="ssva", low_to_zero=TRUE))
merged_nos_fsva_metrics <- sm(graph_metrics(merged_nos_fsva))

6.0.1 The resulting plots

Why is is suddenly printing stuff now and not before?

merged_pcabatch_metrics$pcaplot

merged_nor_pcabatch_metrics$pcaplot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse

merged_nos_pcabatch_metrics$pcaplot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse

merged_sva_metrics$pcaplot

merged_nor_sva_metrics$pcaplot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse

merged_nos_sva_metrics$pcaplot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse

merged_combat_metrics$pcaplot

merged_nor_combat_metrics$pcaplot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse

merged_nos_combat_metrics$pcaplot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse

merged_limma_metrics$pcaplot

merged_nor_limma_metrics$pcaplot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse

merged_nos_limma_metrics$pcaplot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse

merged_fsva_metrics$pcaplot

merged_nor_fsva_metrics$pcaplot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse

merged_nos_fsva_metrics$pcaplot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse

7 Write the expt

## hmm I was going to silence this line, but looking at the report it seems to be missing some pictures.
fun_nor <- write_expt(merged_nor, excel=paste0("excel/samples_written_merged_nor-v", ver, ".xlsx"),
                      filter=TRUE, norm="raw", convert="cpm", batch="fsva", transform="log2")
## Writing the legend.
## Writing the raw reads.
## Graphing the raw reads.

## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Writing the normalized reads.
## Graphing the normalized reads.

## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Writing the median reads by factor.
## The factor WT has 6 rows.
## The factor cbf5_D95A has 8 rows.
## The factor upf1d has 4 rows.
## The factor cbf5_D95A upf1d has 2 rows.

---
title: "S. cerevisiae sample estimation, merged sample edition (20180606)."
author: "atb abelew@gmail.com"
date: "`r Sys.Date()`"
output:
 html_document:
  code_download: true
  code_folding: show
  fig_caption: true
  fig_height: 7
  fig_width: 7
  highlight: default
  keep_md: false
  mode: selfcontained
  number_sections: true
  self_contained: true
  theme: readable
  toc: true
  toc_float:
    collapsed: false
    smooth_scroll: false
---

<style>
  body .main-container {
    max-width: 1600px;
  }
</style>

```{r options, include=FALSE}
if (!isTRUE(get0("skip_load"))) {
  library("hpgltools")
  tt <- devtools::load_all("~/hpgltools")
  knitr::opts_knit$set(progress=TRUE,
                       verbose=TRUE,
                       width=90,
                       echo=TRUE)
  knitr::opts_chunk$set(error=TRUE,
                        fig.width=8,
                        fig.height=8,
                        dpi=96)
  old_options <- options(digits=4,
                         stringsAsFactors=FALSE,
                         knitr.duplicate.label="allow")
  ##ggplot2::theme_set(ggplot2::theme_bw(base_size=10))
  set.seed(1)
  ver <- "20180606"
  previous_file <- paste0("01_annotation-v", ver, ".Rmd")

  tmp <- try(sm(loadme(filename=paste0(gsub(pattern="\\.Rmd", replace="", x=previous_file), "-v", ver, ".rda.xz"))))
  rmd_file <- paste0("02_sample_estimation_merged-v", ver, ".Rmd")
}
```

S. cerevisiae sample estimation of all samples (old and new)
============================================================

This document should make clear the suitability of our yeast data for differential expression
analyses.  It should also give some ideas about the depth and distribution of
the data.

# Gathering samples

Later, we will likely choose to exclude one of the three experimental batches in
the data.  Therefore, I am making three expressionsets, one with all data, and
one the various subsets.

```{r gather_all}
merged_expt <- sm(create_expt(metadata="sample_sheets/all_samples.xlsx",
                              gene_info=sc_all_annotations,
                              file_column="allfile"))
## Don't forget, I need to change the condition names.
only_old <- subset_expt(merged_expt, subset="batch=='E1'")
merged_new <- subset_expt(merged_expt, subset="batch!='E1'")
merged_nor <- subset_expt(merged_expt, subset="batch!='E2B1'")
merged_nos <- subset_expt(merged_expt, subset="batch!='E2B2'")
```

# Visualizing raw data

There are lots of methods we have to examine raw data and explore stuff like
batch effects or non-canonical distributions or skewed counts.  hpgltools
provides some functionality to make this process easier.  The graphs shown below
and many more are generated with the wrapper 'graph_metrics()', when invoked it
provides a list of plots including: library sizes, the number of non-zero genes,
density/box plots by sample, distance/correlation heatmaps, standard median
correlation/distance, PCA/TSNE clustering, top-n genes, a legend describing the
colors/symbols used, and some tables describing the data.  Optionally, it can
also perform quantile-quantile plots showing the distribution of each sample
vs. the median of all samples and MA plots of the same.

Caveat: some plots do not work well with gene IDs that are all-0, thus I first
filter the data to remove them.

I also added a neat function 'plot_libsize_prepost()' which shows how many genes
are poorly represented before/after filtering the data.

The plots printed here are all metrics which are useful when considering raw
data.

```{r raw_explore}
merged_filt <- sm(normalize_expt(merged_expt, filter=TRUE))
merged_metrics <- sm(graph_metrics(merged_expt))

pp(file="illustrator_input/01_legend.pdf", image=merged_metrics$legend)
pp(file="illustrator_input/02_raw_libsize.pdf", image=merged_metrics$libsize)

prepost <- plot_libsize_prepost(merged_expt)
pp(file="illustrator_input/03_libsize_changed_lowgenes.pdf", image=prepost$lowgene_plot)

pp(file="illustrator_input/04_nonzero_genes.pdf", image=merged_metrics$nonzero)
pp(file="illustrator_input/05_raw_boxplot.pdf", image=merged_metrics$boxplot)
pp(file="illustrator_input/06_raw_density.pdf", image=merged_metrics$density)
pp(file="illustrator_input/07_raw_boxplot.pdf", image=merged_metrics$boxplot)
pp(file="illustrator_input/08_topn.pdf", image=merged_metrics$topnplot)
```

# Normalize and visualize

Other metrics are more useful when used with data on the log scale and
normalized by number of reads/library and/or by quantile.

```{r normalize, fig.show="hide"}
merged_norm <- normalize_expt(merged_expt, transform="log2", convert="cpm", filter=TRUE)
merged_metrics <- graph_metrics(merged_norm)
```

The data should now be normalized, lets view some metrics post-facto.

```{r normviz}
pp(file="illustrator_input/09_norm_corheat.pdf", image=merged_metrics$corheat)
pp(file="illustrator_input/10_norm_disheat.pdf", image=merged_metrics$disheat)
## It appears that just the normalization is sufficient to split the samples completely by type and deeply separate them from the heterologous samples

pp(file="illustrator_input/11_norm_smc.pdf", image=merged_metrics$smc)
pp(file="illustrator_input/12_norm_smd.pdf", image=merged_metrics$smd)
## The samples are very well behaved, none fall below the red line.

pp(file="illustrator_input/13_norm_pca.pdf", image=merged_metrics$pcaplot)
pp(file="illustrator_input/14_norm_tsne.pdf", image=merged_metrics$tsneplot)
## The homogeneous wt/mutant are nicely separated, and what is more, the exogeneous samples also split wt/mutant, that might prove to be quite useful.
```

## Print some variance partition information

This is an interesting aside which came up last week for some other data.  It
might be good to include the % variance correlated with 'condition' as a column
in the annotation data for the expressionset.  Thus, when we do the differential
expression analysis later, we can look and see if a 'significant' gene has
variance which is actually correlated with condition.

Since writing these analyses, I implemented this idea and so am including it here.

```{r variance_partitions}
vp <- varpart(merged_expt, predictor=NULL, factors=c("condition", "batch"))
pp(file="illustrator_input/15_merged_varpart_partition.pdf", image=vp$partition_plot)
## Check out the last two columns!
head(fData(vp$modified_expt))

merged_sva <- sm(normalize_expt(merged_expt, transform="log2",
                                filter=TRUE, batch="sva", low_to_zero=TRUE))
vpsva <- varpart(merged_sva, predictor=NULL, factors=c("condition", "batch"))
pp(file="illustrator_input/16_merged_varpart_sva_partition.pdf", image=vpsva$partition_plot)
head(fData(vpsva$modified_expt))
```

# Now look at the data without batch 'E2B1'

```{r nor, fig.show="hide"}
nor_norm <- sm(normalize_expt(merged_nor, transform="log2", convert="cpm", norm="quant", filter=TRUE))
nor_plots <- sm(graph_metrics(nor_norm))
nor_batchnorm <- sm(normalize_expt(merged_nor, transform="log2", convert="cpm", norm="quant", filter=TRUE,
                                batch="fsva"))
nor_batchplots <- sm(graph_metrics(nor_batchnorm))
```

## Print the smaller set of data

```{r nor_show}
## Before fsva, correlation heatmap without 'r'
pp(file="illustrator_input/17_nor_corheat.pdf", image=nor_plots$corheat)
## Before fsva, pca without 'r'
pp(file="illustrator_input/18_nor_pca.pdf", image=nor_plots$pcaplot)
## After fsva, correlation heatmap without 'r'
pp(file="illustrator_input/19_nor_fsva_corheat.pdf", image=nor_batchplots$corheat)
## After fsva, pca without 'r'
pp(file="illustrator_input/20_nor_fsva_pca.pdf", image=nor_batchplots$pcaplot)
```

# Try many batch estimation methods to see what is up.

I do not think any of this information is currently being used, so I am going to
stop running it for the moment but keep it here in case we decide to revive it.

```{r batch_exo, fig.show=FALSE}
merged_pcabatch <- sm(normalize_expt(merged_expt, transform="log2",
                                filter=TRUE, batch="pca", low_to_zero=TRUE))
merged_pcabatch_metrics <- sm(graph_metrics(merged_pcabatch))

merged_nor_pcabatch <- sm(normalize_expt(merged_nor, transform="log2",
                                filter=TRUE, batch="pca", low_to_zero=TRUE))
merged_nor_pcabatch_metrics <- sm(graph_metrics(merged_nor_pcabatch))

merged_nos_pcabatch <- sm(normalize_expt(merged_nos, transform="log2",
                                filter=TRUE, batch="pca", low_to_zero=TRUE))
merged_nos_pcabatch_metrics <- sm(graph_metrics(merged_nos_pcabatch))

merged_sva <- sm(normalize_expt(merged_expt, transform="log2",
                                filter=TRUE, batch="sva", low_to_zero=TRUE))
merged_sva_metrics <- sm(graph_metrics(merged_sva))

merged_nor_sva <- sm(normalize_expt(merged_nor, transform="log2",
                                filter=TRUE, batch="sva", low_to_zero=TRUE))
merged_nor_sva_metrics <- sm(graph_metrics(merged_nor_sva))

merged_nos_sva <- sm(normalize_expt(merged_nos, transform="log2",
                                filter=TRUE, batch="sva", low_to_zero=TRUE))
merged_nos_sva_metrics <- sm(graph_metrics(merged_nos_sva))

merged_combat <- sm(normalize_expt(merged_expt, transform="log2",
                                   filter=TRUE, batch="combat_scale", low_to_zero=TRUE))
merged_combat_metrics <- sm(graph_metrics(merged_combat))

merged_nor_combat <- sm(normalize_expt(merged_nor, transform="log2",
                                   filter=TRUE, batch="combat_scale", low_to_zero=TRUE))
merged_nor_combat_metrics <- sm(graph_metrics(merged_nor_combat))

merged_nos_combat <- sm(normalize_expt(merged_nos, transform="log2",
                                   filter=TRUE, batch="combat_scale", low_to_zero=TRUE))
merged_nos_combat_metrics <- sm(graph_metrics(merged_nos_combat))

merged_limma <- sm(normalize_expt(merged_expt, transform="log2",
                                   filter=TRUE, batch="limmaresid", low_to_zero=TRUE))
merged_limma_metrics <- sm(graph_metrics(merged_limma))

merged_nor_limma <- sm(normalize_expt(merged_nor, transform="log2",
                                   filter=TRUE, batch="limmaresid", low_to_zero=TRUE))
merged_nor_limma_metrics <- sm(graph_metrics(merged_nor_limma))

merged_nos_limma <- sm(normalize_expt(merged_nos, transform="log2",
                                   filter=TRUE, batch="limmaresid", low_to_zero=TRUE))
merged_nos_limma_metrics <- sm(graph_metrics(merged_nos_limma))

merged_fsva <- sm(normalize_expt(merged_expt, transform="log2",
                              filter=TRUE, batch="fsva", low_to_zero=TRUE))
merged_fsva_metrics <- sm(graph_metrics(merged_fsva))

merged_nor_fsva <- sm(normalize_expt(merged_nor, transform="log2",
                              filter=TRUE, batch="fsva", low_to_zero=TRUE))
merged_nor_fsva_metrics <- sm(graph_metrics(merged_nor_fsva))

merged_nos_fsva <- sm(normalize_expt(merged_nos, transform="log2",
                              filter=TRUE, batch="ssva", low_to_zero=TRUE))
merged_nos_fsva_metrics <- sm(graph_metrics(merged_nos_fsva))
```

### The resulting plots

Why is is suddenly printing stuff now and not before?

```{r batch_estimation_plots}
merged_pcabatch_metrics$pcaplot
merged_nor_pcabatch_metrics$pcaplot
merged_nos_pcabatch_metrics$pcaplot
merged_sva_metrics$pcaplot
merged_nor_sva_metrics$pcaplot
merged_nos_sva_metrics$pcaplot
merged_combat_metrics$pcaplot
merged_nor_combat_metrics$pcaplot
merged_nos_combat_metrics$pcaplot
merged_limma_metrics$pcaplot
merged_nor_limma_metrics$pcaplot
merged_nos_limma_metrics$pcaplot
merged_fsva_metrics$pcaplot
merged_nor_fsva_metrics$pcaplot
merged_nos_fsva_metrics$pcaplot
```

# Write the expt

```{r fsva_written}
## hmm I was going to silence this line, but looking at the report it seems to be missing some pictures.
fun_nor <- write_expt(merged_nor, excel=paste0("excel/samples_written_merged_nor-v", ver, ".xlsx"),
                      filter=TRUE, norm="raw", convert="cpm", batch="fsva", transform="log2")
```

```{r saveme, include=FALSE}
if (!isTRUE(get0("skip_load"))) {
  pander::pander(sessionInfo())
  message(paste0("This is hpgltools commit: ", get_git_commit()))
  this_save <- paste0(gsub(pattern="\\.Rmd", replace="", x=rmd_file), "-v", ver, ".rda.xz")
  tmp <- sm(saveme(filename=this_save))
}
```
